The NeoSpan project develops a comprehensive, dedicated pipeline for neoantigen (i.e. tumor-specific antigens that can be recognized by the immune system) retrieval that considers the spatial complexity of tumor microenvironments and enables reliable identification of immunogenic targets.
Neoantigens hold great potential as targets for personalized cancer vaccines and immunotherapies, especially when their spatial context is considered: it reveals patterns of immune accessibility and immune microenvironment dynamics that are otherwise obscured in bulk analyses. To this end, a pipeline that integrates multiple data modalities (e.g. genomics, transcriptomics, and spatial coordinates) with mutation calling and neoantigen prediction enables the accurate identification of neoantigens and the assessment of their immunogenic potential.
The project aims to catalyze the design and implementation of an open-source pipeline prototype for retrieving and spatially mapping neoantigens within spatial omics data.
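The pipeline can be divided into four primary categories: mutation calling, neoantigen prediction, mapping of the predicted neoantigens to tissue spots, and spatial and non-spatial statistical evaluation.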
Those categories are discussed below.
When investigating neoantigens, only mutations that are transcribed and translated into proteins matter, since only those can give rise to neoantigens. Mutation calling from spatial RNA-seq data therefore highlights the expressed mutations (i.e. those with translation potential) and pinpoints their location in the tissue (e.g. tumor edge, immune region). Furthermore, by definition of a neoantigen, the mutations must appear only in tumor cells. To this end, for mutation calling we use a modified version of the SComatic tool that, in addition to the spatial RNA-seq data, incorporates region annotation information (e.g. tumor vs. normal). Overall, its advantages as a mutation calling technique for neoantigen prediction include
The methodology is displayed in the following figure.
We note that this method is useful for identifying active (i.e. expressed) mutations for neoantigen prediction, but it is not efficient for discovering mutations in general: spatial RNA-seq datasets are characterized by low coverage and by low sensitivity and specificity for mutation detection. In addition, they only detect mutations in transcribed genes, and only at sites with sufficient expression.
The output of the mutation calling step (a VCF file) is then used as input for the neoantigen prediction.
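The tumor-only requirement described above can be illustrated with a minimal sketch. This is not SComatic's algorithm or its modified version. It only shows the idea of keeping variants whose alternative allele is supported in tumor-annotated regions and never observed in normal-annotated ones. The record layout (`chrom`, `pos`, `ref`, `alt`, `region`, `alt_reads`) and the `min_alt_reads` threshold are assumptions made for this example.

```python
from collections import defaultdict


def tumor_specific_variants(calls, min_alt_reads=2):
    """Keep variants seen in tumor regions and absent from normal regions.

    `calls` is an iterable of dicts with the hypothetical keys
    chrom, pos, ref, alt, region ("tumor" or "normal") and alt_reads.
    """
    # Sum alt-supporting reads per variant and per annotated region.
    alt_by_region = defaultdict(lambda: defaultdict(int))
    for call in calls:
        key = (call["chrom"], call["pos"], call["ref"], call["alt"])
        alt_by_region[key][call["region"]] += call["alt_reads"]

    kept = []
    for key, regions in alt_by_region.items():
        enough_tumor_support = regions.get("tumor", 0) >= min_alt_reads
        absent_in_normal = regions.get("normal", 0) == 0
        if enough_tumor_support and absent_in_normal:
            kept.append(key)
    return sorted(kept)


example_calls = [
    {"chrom": "chr1", "pos": 100, "ref": "A", "alt": "T", "region": "tumor", "alt_reads": 3},
    {"chrom": "chr1", "pos": 100, "ref": "A", "alt": "T", "region": "normal", "alt_reads": 0},
    {"chrom": "chr2", "pos": 200, "ref": "G", "alt": "C", "region": "tumor", "alt_reads": 4},
    {"chrom": "chr2", "pos": 200, "ref": "G", "alt": "C", "region": "normal", "alt_reads": 1},
]
print(tumor_specific_variants(example_calls))
```

Given the low coverage noted above, a variant that looks absent in normal regions may simply be unsampled there, so a filter of this kind is only as reliable as the depth behind it.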
For exploring MHC-binding neoantigen candidates, we use the pVACseq tool. This additionally requires VEP annotation of the VCF file and HLA typing of the sample.
For HLA typing, there are several options, depending on the data and information available for the sample.
Once the VEP annotation and HLA typing are available, we can run pVACseq and extract the “all_epitopes.aggregated.tsv” file.
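As a rough sketch of this step, the pVACseq call can be assembled programmatically. The helper below only builds the argument list in the positional order that `pvacseq run` expects (input VCF, sample name, comma-separated HLA alleles, one or more prediction algorithms, output directory). The function name, file paths, sample name, alleles and algorithm choices are placeholders, not NeoSpan's actual settings, and any optional flags should be checked against the installed pVACtools version.

```python
import shlex


def build_pvacseq_command(vcf_path, sample_name, hla_alleles, algorithms, output_dir):
    """Return the argument list for a basic `pvacseq run` invocation.

    Hypothetical helper: it does not run anything, it only orders the
    positional arguments.
    """
    if not hla_alleles:
        raise ValueError("at least one HLA allele is required")
    if not algorithms:
        raise ValueError("at least one prediction algorithm is required")
    return [
        "pvacseq", "run",
        vcf_path,               # VEP-annotated VCF from the mutation calling step
        sample_name,            # must match the sample name in the VCF
        ",".join(hla_alleles),  # alleles are passed as one comma-separated string
        *algorithms,            # algorithms are passed as separate arguments
        output_dir,
    ]


cmd = build_pvacseq_command(
    "sample.vep.vcf",                # placeholder path
    "SAMPLE_1",                      # placeholder sample name
    ["HLA-A*02:01", "HLA-B*07:02"],  # placeholder HLA typing result
    ["MHCflurry", "NetMHCpan"],      # placeholder algorithm selection
    "pvacseq_out",
)
print(shlex.join(cmd))
# To execute: subprocess.run(cmd, check=True), with pVACtools installed.
```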
After obtaining the neoantigen prediction results, the original 10x Visium BAM file is inspected to determine whether each spot carries the corresponding gene mutations. Spots carrying the respective mutations are labeled neoantigen-positive; all others are labeled negative. Only mutations of high quality (Phred > 30) are accepted.
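The spot-level labeling can be sketched as follows, assuming the reads covering a mutated position have already been pulled from the BAM file (for example with a BAM reader such as pysam, taking the spot barcode from the `CB` tag). The input layout, the function name, and the reading of "Phred > 30" as a per-base quality threshold are assumptions for this example.

```python
def classify_spots(observations, alt_base, min_phred=30):
    """Label each spot as neoantigen-positive (True) or negative (False).

    `observations` is an iterable of (spot_barcode, base, phred) tuples,
    one per read covering the mutated position. A spot is positive if at
    least one of its reads shows the alternative base with quality
    strictly above `min_phred`.
    """
    status = {}
    for barcode, base, phred in observations:
        # Every covered spot starts as negative.
        status.setdefault(barcode, False)
        if base == alt_base and phred > min_phred:
            status[barcode] = True
    return status


reads_at_site = [
    ("AAAC-1", "T", 37),  # alt base, high quality -> positive
    ("AAAC-1", "A", 38),
    ("CCGT-1", "T", 20),  # alt base, low quality -> ignored
    ("GGTA-1", "A", 40),  # reference base only -> negative
    ("TTAG-1", "T", 30),  # exactly 30 does not pass the "> 30" rule
]
print(classify_spots(reads_at_site, alt_base="T"))
```

Spots with no reads at the position do not appear in the result at all. Whether such spots are treated as negative or as uninformative is a design choice that this sketch leaves open.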
The last category of NeoSpan concerns the use of spatial and non-spatial statistics to evaluate the neoantigens and their impact on gene expression. To make this as easy as possible, the NeoSpan Dashboard has been developed. The dashboard aims to provide a guided and user-friendly tool for the spatial and non-spatial evaluation of the neoantigens and their impact on, among others, gene mutations.
More specifically, via this Dashboard, the user can:
For the aforementioned functionalities, the user can create dynamic, interactive plots, which in turn simplify data-driven analysis and support decision-making.
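As one illustration of what a spatial statistic on neoantigen labels can look like, the sketch below computes Moran's I, a common measure of spatial autocorrelation, on binary neoantigen-positive/negative spot labels. It is not confirmed to be among the statistics the Dashboard computes. The binary distance-based neighborhood (`radius`) and the toy coordinates are assumptions for this example.

```python
import math


def morans_i(coords, values, radius):
    """Moran's I with binary weights: w_ij = 1 if spots i and j lie within `radius`."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    denom = sum(d * d for d in dev)
    if denom == 0:
        raise ValueError("all values are identical; Moran's I is undefined")

    num = 0.0
    weight_sum = 0.0
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(coords[i], coords[j]) <= radius:
                num += dev[i] * dev[j]
                weight_sum += 1.0
    if weight_sum == 0:
        raise ValueError("no neighboring spots within the given radius")
    return (n / weight_sum) * (num / denom)


# Four spots on a line; 1 = neoantigen-positive, 0 = negative.
spots = [(0, 0), (1, 0), (2, 0), (3, 0)]
clustered = morans_i(spots, [1, 1, 0, 0], radius=1)    # positives sit together
alternating = morans_i(spots, [1, 0, 1, 0], radius=1)  # positives are interleaved
print(clustered, alternating)
```

Positive values indicate that neoantigen-positive spots tend to neighbor each other, while negative values indicate an interleaved pattern. Assessing significance would additionally need a permutation test or a similar procedure.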